Abstract

Goodness-of-fit tests gauge whether a given set of observations is consistent (up to 
expected random fluctuations) with arising as independent and identically distributed 
(i.i.d.) draws from a user-specified probability distribution known as the "model." 
The standard gauges involve the discrepancy between the model and the empirical 
distribution of the observed draws. Some measures of discrepancy are cumulative; 
others are not. The most popular cumulative measure is the Kolmogorov-Smirnov 
statistic; when all probability distributions under consideration are discrete, a natural 
noncumulative measure is the Euclidean distance between the model and the empirical 
distributions. In the present paper, both mathematical analysis and its illustration 
via various data sets indicate that the Kolmogorov-Smirnov statistic tends to be more 
powerful than the Euclidean distance when there is a natural ordering for the values 
that the draws can take — that is, when the data is ordinal — whereas the Euclidean 
distance is more reliable and more easily understood than the Kolmogorov-Smirnov 
statistic when there is no natural ordering (or partial order) — that is, when the data 
is nominal. 
1 Introduction



Testing goodness-of-fit is one of the foundations of modern statistics, as elucidated by Rao
(2002), for example. The formulation in the discrete setting involves n independent and
identically distributed (i.i.d.) draws from a probability distribution over m bins ("categories,"
"cells," and "classes" are common synonyms for "bins"). In accordance with the standard
conventions, we will use $p$ to denote the actual (unknown) underlying distribution of the
draws; $p = (p^{(1)}, p^{(2)}, \dots, p^{(m)})$, with $p^{(1)}, p^{(2)}, \dots, p^{(m)}$ being nonnegative and

$$\sum_{j=1}^m p^{(j)} = 1. \tag{1}$$

We will use $p_0$ to denote a user-specified distribution, usually called the "model"; again
$p_0 = (p_0^{(1)}, p_0^{(2)}, \dots, p_0^{(m)})$, with $p_0^{(1)}, p_0^{(2)}, \dots, p_0^{(m)}$ being nonnegative and

$$\sum_{j=1}^m p_0^{(j)} = 1. \tag{2}$$

A goodness-of-fit test produces a value (the "P-value") that gauges the consistency
of the observed data with the assumption that $p = p_0$. In many formulations, the
user-specified model $p_0$ consists of a family of probability distributions parameterized by
$\theta$, where $\theta$ can be integer-valued, real-valued, complex-valued, vector-valued,
matrix-valued, or any combination of the many possibilities. In such cases, the P-value gauges
the consistency of the observed data with the assumption that $p = p_0(\hat\theta)$, where
$\hat\theta$ is an estimate (taken to be the maximum-likelihood estimate throughout the present
paper). We now review the definition of P-values.

P-values are defined via the empirical distribution $\hat p$, where
$\hat p = (\hat p^{(1)}, \hat p^{(2)}, \dots, \hat p^{(m)})$, with $\hat p^{(j)}$
being the proportion of the n observed draws that fall in the jth bin, that is, $\hat p^{(j)}$ is
the number of draws falling in the jth bin, divided by n. P-values involve a hypothetical
experiment taking n i.i.d. draws from the assumed actual underlying distribution $p = p_0(\hat\theta)$.
We denote by $\hat P$ the empirical distribution of the draws from the hypothetical experiment; we
denote by $\hat\Theta$ a maximum-likelihood estimate of $\theta$ obtained from the hypothetical
experiment. The P-value is then the probability that the discrepancy between the random variables
$\hat P$ and $p_0(\hat\Theta)$ is at least as large as the observed discrepancy between $\hat p$
and $p_0(\hat\theta)$, calculating the probability under the assumption that $p = p_0(\hat\theta)$.

To complete the definition of P-values, we must choose a measure of discrepancy. In the
present paper, we consider the (discrete) Kolmogorov-Smirnov and Euclidean distances,

$$d_1(a, b) = \max_{1 \le k \le m} \left| \sum_{j=1}^k a^{(j)} - \sum_{j=1}^k b^{(j)} \right| \tag{3}$$

and

$$d_2(a, b) = \sqrt{\sum_{j=1}^m \left( a^{(j)} - b^{(j)} \right)^2}, \tag{4}$$

respectively. The P-value for the Kolmogorov-Smirnov statistic is the probability that
$d_1(\hat P, p_0(\hat\Theta)) \ge d_1(\hat p, p_0(\hat\theta))$; the P-value for the Euclidean
distance is the probability that $d_2(\hat P, p_0(\hat\Theta)) \ge d_2(\hat p, p_0(\hat\theta))$.
When evaluating the probabilities, we view $\hat P$ and $\hat\Theta$ as random variables,
constructed with i.i.d. draws from the assumed distribution $p = p_0(\hat\theta)$, while
viewing the observed $\hat p$ and $\hat\theta$ as fixed, not random.
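
As a concrete illustration of the two discrepancies in (3) and (4), the following minimal Python sketch evaluates both for a pair of discrete distributions given as lists of bin probabilities. The function names `ks_distance` and `euclidean_distance` and the four-bin example are ours, chosen purely for illustration.

```python
from itertools import accumulate
from math import sqrt


def ks_distance(a, b):
    """Discrete Kolmogorov-Smirnov distance: largest absolute partial sum of a - b."""
    return max(abs(s) for s in accumulate(x - y for x, y in zip(a, b)))


def euclidean_distance(a, b):
    """Euclidean distance: root-sum-square of the binwise differences a - b."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical four-bin example: a uniform model and an empirical distribution.
model = [0.25, 0.25, 0.25, 0.25]
empirical = [0.5, 0.0, 0.5, 0.0]

print(ks_distance(empirical, model))         # 0.25
print(euclidean_distance(empirical, model))  # 0.5
```

Reordering the bins of this example so that the two overfull bins come first doubles the Kolmogorov-Smirnov distance to 0.5, while the Euclidean distance stays at 0.5.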

If a P-value is very small, then we can be confident that the given observed draws are 
inconsistent with the assumed model, are not i.i.d., or are both inconsistent and not i.i.d. 

Needless to say, the Kolmogorov-Smirnov distance defined in (3) is the maximum absolute
difference between cumulative distribution functions. The Kolmogorov-Smirnov statistic 
depends on the ordering of the bins, unlike the Euclidean distance. 

As supported by the investigations below, we recommend using the Kolmogorov-Smirnov 
statistic when there is a natural ordering of the bins, while the Euclidean distance is more 
reliable and more easily understood than the Kolmogorov-Smirnov statistic when there is no 
natural ordering (or partial order). Unlike the Euclidean distance, the Kolmogorov-Smirnov 
statistic utilizes the information in a natural ordering of the bins, when the latter is available.



Horn (1977) gave similar recommendations when comparing the $\chi^2$ and Kolmogorov-Smirnov
statistics. Detailed comparisons between the Euclidean distance and $\chi^2$ statistics are
available in Perkins et al. (2011a).

The Kolmogorov-Smirnov statistic is cumulative; it accentuates low-frequency differences
between the model and the empirical distribution of the draws, but tends to average away
and otherwise obscure high-frequency differences. Similar observations have been made
by Pettitt and Stephens (1977), D'Agostino and Stephens (1986), Choulakian et al. (1994),
From (1996), Best and Rayner (1997), Haschenburger and Spinelli (2005), Steele and Chaseling
(2006), Lockhart et al. (2007), Ampadu (2008), and Ampadu et al. (2009), among others.
Our suggestions appear to be closest to those of Horn (1977).

There are many cumulative approaches similar to the Kolmogorov-Smirnov statistic.
These include the Cramér-von-Mises, Watson, Kuiper, and Rényi statistics, as well as their
Anderson-Darling variants; Section 14.3.4 of Press et al. (2007), Stephens (1970), and Rényi
(1953) review these statistics. We ourselves are fond of the Kuiper approach. However, the
present paper focuses on the popular Kolmogorov-Smirnov statistic; the Cramér-von-Mises,
Watson, and Kuiper variants are very similar.

The remainder of the present paper has the following structure: Section 2 describes how
the Euclidean distance is generally preferable to the Kolmogorov-Smirnov statistic when
there is no natural ordering (or partial order) of the bins. Section 3 describes how the
Kolmogorov-Smirnov statistic is generally preferable to the Euclidean distance when there
is a natural ordering of the bins. Section 4 illustrates both cases with examples of data
sets and the associated P-values, computing the P-values via Monte-Carlo simulations with
guaranteed error bounds. The reader may wish to begin with Section 4, referring back to
earlier sections as needed.
2 The case when the bins do not have a natural order



The Euclidean distance is generally preferable to the Kolmogorov-Smirnov statistic when
there is no natural ordering (or partial order) of the bins. As discussed by Perkins et al.
(2011b), the interaction of parameter estimation and the Euclidean distance is easy to
understand and quantify, at least asymptotically, in the limit of large numbers of draws.
In contrast, the interaction of parameter estimation and the Kolmogorov-Smirnov statistic
can be very complicated, though Choulakian et al. (1994) and Lockhart et al. (2007) have
pointed out that the interaction is somewhat simpler with Cramér's and von Mises',
Watson's, and some of Anderson's and Darling's very similar statistics. That said, the Euclidean
distance can be more reliable even when there are no parameters in the model, that is, when
the model $p_0$ is a single, fixed, fully specified probability distribution; the remainder of the
present section describes why.

The basis of the analysis is the following lemma, a reformulation of the fact that the
expected maximum absolute deviation from zero of the standard Brownian bridge is
$\sqrt{\pi/2} \cdot \ln(2) \approx .8687$ (see, for example, Section 3 of Marsaglia et al., 2003).

Lemma 2.1. Suppose that m is even and that $D^{(1)}, D^{(2)}, \dots, D^{(m)}$ form a randomly ordered
list of m/2 positive ones and m/2 negative ones (with the ordering drawn uniformly at
random). Then,

$$\mathbf{E} \max_{1 \le k \le m} \left| \sum_{j=1}^k D^{(j)} \right| \Big/ \sqrt{m} \to \sqrt{\pi/2} \cdot \ln(2) \tag{5}$$

in the limit that $m \to \infty$, where (as usual) $\mathbf{E}$ produces the expected value.
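
The limit in Lemma 2.1 can be checked numerically. The following is a minimal Monte-Carlo sketch, assuming a finite m and a fixed pseudorandom seed (both our choices, not the paper's); the function name `mean_scaled_max` is hypothetical. For finite m the estimate is expected to fall slightly below the limiting value, since the lemma is only asymptotic.

```python
import random
from itertools import accumulate
from math import log, pi, sqrt


def mean_scaled_max(m, trials, seed=0):
    """Estimate E max_k |D(1) + ... + D(k)| / sqrt(m) over uniformly random orderings."""
    rng = random.Random(seed)
    steps = [1] * (m // 2) + [-1] * (m // 2)
    total = 0.0
    for _ in range(trials):
        rng.shuffle(steps)  # uniformly random ordering of the m/2 plus and m/2 minus signs
        total += max(abs(s) for s in accumulate(steps))
    return total / (trials * sqrt(m))


limit = sqrt(pi / 2) * log(2)  # the right-hand side of (5)
estimate = mean_scaled_max(m=1600, trials=1000)
print(f"limit {limit:.4f}, Monte-Carlo estimate {estimate:.4f}")
```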



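We denote by $p$ the actual underlying distribution of the n observed i.i.d. draws. We
denote by $p_0$ the model distribution. We denote by $\hat P$ the empirical distribution of the n
draws. These are all probability distributions, that is, $p^{(j)} \ge 0$, $p_0^{(j)} \ge 0$, and
$\hat P^{(j)} \ge 0$ for j = 1, 2, . . . , m, and (1) and (2) hold.

Suppose that the actual underlying distribution $p^{(1)}, p^{(2)}, \dots, p^{(m)}$ of the draws is the
same as the model distribution $p_0^{(1)}, p_0^{(2)}, \dots, p_0^{(m)}$; the random variables
$\hat P^{(1)}, \hat P^{(2)}, \dots, \hat P^{(m)}$ are then the proportions of n i.i.d. draws from $p_0$
that fall in the respective m bins. The Euclidean distance is

$$U = \sqrt{\sum_{j=1}^m \left( \hat P^{(j)} - p_0^{(j)} \right)^2}. \tag{6}$$

The Kolmogorov-Smirnov statistic is

$$V = \max_{1 \le k \le m} \left| \sum_{j=1}^k \left( \hat P^{(j)} - p_0^{(j)} \right) \right|. \tag{7}$$

The expected value of the square of the Euclidean distance is

$$\mathbf{E}\, U^2 = \sum_{j=1}^m \mathbf{E} \left( \hat P^{(j)} - p_0^{(j)} \right)^2 = \sum_{j=1}^m \frac{p_0^{(j)} \left( 1 - p_0^{(j)} \right)}{n} \le \frac{1}{n}. \tag{8}$$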
As shown, for example, by Durbin (1972) using Lemma 2.1 above, the expected value of $\sqrt{n}$
times the Kolmogorov-Smirnov statistic is

$$\mathbf{E}\, V \sqrt{n} \to \sqrt{\pi/2} \cdot \ln(2) \approx .8687 \tag{9}$$

in the limit that $n \to \infty$ and $\max_{1 \le j \le m} p_0^{(j)} \to 0$. Comparing (8) and (9), we
see that U and V are roughly the same size (inversely proportional to $\sqrt{n}$) when the actual
underlying distribution of the draws is the same as the model distribution.

However, when the actual underlying distribution of the draws differs from the model
distribution, the Euclidean distance and the Kolmogorov-Smirnov statistic can be very
different. If the number n of draws is large, then the empirical distribution $\hat P$ will be very
close to the actual distribution $p$. Therefore, to study the performance of the goodness-of-fit
statistics as $n \to \infty$ when the actual distribution $p$ differs from the model distribution $p_0$
(and both are independent of n), we can focus on the difference between $p$ and $p_0$ (rather
than the difference between $\hat P$ and $p_0$). We now define and study the difference

$$d^{(j)} = p^{(j)} - p_0^{(j)} \tag{10}$$

for j = 1, 2, . . . , m. The Euclidean distance between $p$ and $p_0$ (the root-sum-square
difference) is

$$u = \sqrt{\sum_{j=1}^m \left( d^{(j)} \right)^2}. \tag{11}$$

The Kolmogorov-Smirnov statistic (the maximum absolute cumulative difference) is

$$v = \max_{1 \le k \le m} \left| \sum_{j=1}^k d^{(j)} \right|. \tag{12}$$

For simplicity (and because the following analysis generalizes straightforwardly), let us
consider the illustrative case in which $|d^{(1)}| = |d^{(2)}| = \dots = |d^{(m)}|$, that is,

$$\left| d^{(j)} \right| = c_m \tag{13}$$

for all j = 1, 2, . . . , m, where $c_m$ is a positive real number ($c_m$ must always satisfy
$m \cdot c_m \le 2$, since
$m \cdot c_m = \sum_{j=1}^m c_m = \sum_{j=1}^m |d^{(j)}| \le \sum_{j=1}^m \left[ p^{(j)} + p_0^{(j)} \right] = 2$).
Combining (10), (1), and (2) yields that

$$\sum_{j=1}^m d^{(j)} = 0. \tag{14}$$

Together, (14) and (13) imply that m is even and that half of $d^{(1)}, d^{(2)}, \dots, d^{(m)}$ are equal
to $+c_m$, and the other half are equal to $-c_m$.

Combining (13) and (11) yields that the Euclidean distance is

$$u = \sqrt{m} \cdot c_m. \tag{15}$$

The fact that half of $d^{(1)}, d^{(2)}, \dots, d^{(m)}$ are equal to $+c_m$, and the other half are equal to
$-c_m$, yields that the Kolmogorov-Smirnov statistic v defined in (12) could be as small as $c_m$
or as large as $m \cdot c_m / 2$, depending on the ordering of the signs in $d^{(1)}, d^{(2)}, \dots, d^{(m)}$. If all
orderings are equally likely (which is equivalent to ordering the bins uniformly at random),
then by Lemma 2.1 the mean value for v is
$\sqrt{m \pi / 2} \cdot \ln(2) \cdot c_m \approx \sqrt{m} \cdot .8687 \cdot c_m$ in the limit
that m is large (this is the expected maximum absolute deviation from zero of a tied-down
random walk with m steps, each of length $c_m$, that starts and ends at zero; the random walk
ends at zero due to (14)).
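
The dependence of v on the ordering of the signs, and the independence of u from it, can be seen in a small deterministic example. The sketch below assumes the hypothetical values m = 8 and c_m = 0.125 (chosen so that the arithmetic is exact in binary floating point); the helper names are ours.

```python
from itertools import accumulate
from math import sqrt


def ks_statistic(d):
    """Maximum absolute cumulative difference, as in (12)."""
    return max(abs(s) for s in accumulate(d))


def euclidean(d):
    """Root-sum-square difference, as in (11)."""
    return sqrt(sum(x * x for x in d))


m, c = 8, 0.125  # hypothetical values satisfying m * c <= 2

alternating = [c if j % 2 == 0 else -c for j in range(m)]  # +c, -c, +c, -c, ...
grouped = [c] * (m // 2) + [-c] * (m // 2)                 # all +c first, then all -c

print(euclidean(alternating), euclidean(grouped))  # both equal sqrt(m) * c
print(ks_statistic(alternating))                   # c = 0.125, the smallest possible value
print(ks_statistic(grouped))                       # m * c / 2 = 0.5, the largest possible value
```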

Thus, in the limit that the number n of draws is large (and $\max_{1 \le j \le m} p_0^{(j)} \to 0$, while
both the model $p_0$ and the alternative distribution $p$ are independent of n), the Euclidean
distance and the Kolmogorov-Smirnov statistic have similar statistical power on average, if
all orderings of the bins are equally likely. However, the Euclidean distance is the same for
any ordering of the bins, whereas the power of the Kolmogorov-Smirnov statistic depends
strongly on the ordering. We see, then, that the Euclidean distance is more reliable than
the Kolmogorov-Smirnov statistic when there is no especially natural ordering for the bins.

Remark 2.2. It is possible to use an ordering for which the Kolmogorov-Smirnov statistic
attains its greatest value (this corresponds to renumbering the bins such that the differences
$D^{(j)} = \hat P^{(j)} - p_0^{(j)}$ satisfy $D^{(1)} \ge D^{(2)} \ge \dots \ge D^{(m)}$ or
$D^{(1)} \le D^{(2)} \le \dots \le D^{(m)}$). However,
this data-dependent ordering produces a statistic which is proportional to the $\ell^1$ distance
$\sum_{j=1}^m |D^{(j)}|$ (whereas the Euclidean distance is the $\ell^2$ distance), as remarked at the
top of page 396 of Hoeffding (1965). The resulting statistic is no longer cumulative.
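
The proportionality noted in Remark 2.2 can be illustrated with a short sketch. Because the differences sum to zero, sorting them in decreasing order makes the largest partial sum equal to the total of the positive differences, which is half the sum of the absolute differences. The differences below are hypothetical values made up for illustration.

```python
from itertools import accumulate


def ks_statistic(d):
    """Maximum absolute partial sum of the differences d."""
    return max(abs(s) for s in accumulate(d))


# Hypothetical differences between an empirical distribution and a model (they sum to zero).
d = [0.10, -0.05, 0.02, -0.07, 0.03, -0.03]

sorted_ks = ks_statistic(sorted(d, reverse=True))  # data-dependent ordering
half_l1 = sum(abs(x) for x in d) / 2               # half the l1 distance

print(sorted_ks, half_l1)
```

Both printed values agree up to floating-point rounding, and no other ordering of these six differences yields a larger statistic.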



3 The case when the bins have a natural order 

The Kolmogorov-Smirnov statistic is often preferable to the Euclidean distance when there is 
a natural ordering of the bins. In fact, the Kolmogorov-Smirnov statistic is always preferable 
when the data is very sparse and there is a natural ordering of the bins. In the limit that the 
maximum expected number of draws per bin tends to zero, the Euclidean distance always 
takes the same value under the null hypothesis, providing no discriminative power: indeed, 
when the draws producing the empirical distribution P are taken from the model distribution 
$p_0$, the Euclidean distance is almost surely $1/\sqrt{n}$,

$$\sqrt{\sum_{j=1}^m \left( \hat P^{(j)} - p_0^{(j)} \right)^2} = \frac{1}{\sqrt{n}}, \tag{16}$$

in the limit that $n \cdot \max_{1 \le j \le m} p_0^{(j)} \to 0$ (the reason is that, in this limit,
$\max_{1 \le j \le m} p_0^{(j)} \to 0$ and moreover almost every realization of the experiment satisfies
that, for all j = 1, 2, . . . , m, $\hat P^{(j)} = 0$ or $\hat P^{(j)} = 1/n$, that is, there is at most
one observed draw per bin). In
contrast, the Kolmogorov-Smirnov statistic is nontrivial even in the limit that the maximum 
expected number of draws per bin tends to zero — in fact, this is exactly the continuum limit 
for the original Kolmogorov-Smirnov statistic involving continuous cumulative distribution 
functions (as opposed to the discontinuous cumulative distribution functions arising from 
the discrete distributions considered in the present paper). Furthermore, the Kolmogorov- 
Smirnov statistic is sensitive to symmetry (or asymmetry) in a distribution, and can detect 
other interesting properties of distributions that depend on the ordering of the bins. 
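
The sparse limit behind (16) can be made concrete under a simplifying assumption that is ours, not the paper's: a uniform model on m bins, with the n draws landing in n distinct bins. In that case the squared Euclidean distance works out to 1/n - 1/m, which tends to 1/n as m grows. The function name below is hypothetical.

```python
from math import sqrt


def sparse_euclidean_distance(n, m):
    """Euclidean distance from the uniform model on m bins when n draws occupy n distinct bins."""
    occupied = n * (1 / n - 1 / m) ** 2  # n bins, each holding exactly one draw
    empty = (m - n) * (1 / m) ** 2       # m - n bins holding no draws
    return sqrt(occupied + empty)


n = 100
for m in (10**3, 10**6, 10**9):
    print(m, sparse_euclidean_distance(n, m), 1 / sqrt(n))
```

The distance depends only on n and m, not on which bins are occupied, so it carries no information about the fit in this regime.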



4 Data analysis



This section gives four examples illustrating the performance of the Kolmogorov-Smirnov
statistic and the Euclidean distance in various circumstances. The Kolmogorov-Smirnov
statistic is more powerful than the Euclidean distance in the first two examples, for which
there are natural orderings of the bins. The Euclidean distance is more reliable than the
Kolmogorov-Smirnov statistic in the last two examples, for which any ordering of the bins
is necessarily rather arbitrary. We computed all P-values via Monte-Carlo simulations
with guaranteed error bounds, as in Remark 3.3 of Perkins et al. (2011a). Remark 3.4
of Perkins et al. (2011a) proves that the standard error of the estimate for a P-value P is
$\sqrt{P(1 - P)/\ell}$, where $\ell$ is the number of simulations conducted to calculate the P-value.

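
A plain Monte-Carlo estimate of a P-value, together with the standard error quoted above, might be sketched as follows. This is a simplified illustration for a fully specified model (no parameter estimation) using the Euclidean distance; it is not the procedure with guaranteed error bounds from Perkins et al. (2011a). The function names, the seed, and the four-bin example data are our assumptions.

```python
import random
from math import sqrt


def euclidean_distance(a, b):
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def monte_carlo_p_value(counts, model, num_simulations, seed=0):
    """Estimate the P-value of the Euclidean distance for a fully specified model.

    Returns the estimate P and its standard error sqrt(P * (1 - P) / num_simulations).
    """
    rng = random.Random(seed)
    n = sum(counts)
    bins = range(len(model))
    observed = euclidean_distance([c / n for c in counts], model)
    exceed = 0
    for _ in range(num_simulations):
        simulated = [0] * len(model)
        for j in rng.choices(bins, weights=model, k=n):  # n i.i.d. draws from the model
            simulated[j] += 1
        if euclidean_distance([c / n for c in simulated], model) >= observed:
            exceed += 1
    p = exceed / num_simulations
    return p, sqrt(p * (1 - p) / num_simulations)


# Hypothetical data: 40 draws over four bins, tested against a uniform model.
p, se = monte_carlo_p_value([14, 9, 11, 6], [0.25] * 4, num_simulations=2000)
print(f"P-value {p:.3f} with standard error {se:.3f}")
```

Since the standard error shrinks like the inverse square root of the number of simulations, halving it requires four times as many simulations.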
4.1 A test of randomness 

A particular random number generator is supposed to produce an integer from 1 to $2^{32}$
uniformly at random. The model distribution for such a generator is

$$p_0^{(j)} = 2^{-32} \tag{17}$$

for j = 1, 2, . . . , $2^{32}$. We test the (obviously poor) generator which produces the numbers
1, 2, 3, . . . , n, in that order, so that the observed distribution of the generated numbers is

$$\hat p^{(j)} = \begin{cases} 1/n, & j = 1, 2, \dots, n \\ 0, & j = n+1, n+2, \dots, 2^{32} \end{cases} \tag{18}$$

for j = 1, 2, . . . , $2^{32}$. For these observations, the P-value for the Euclidean distance is 1
to several digits of precision, while the P-value for the Kolmogorov-Smirnov statistic is 0 to
several digits, at least for n between a hundred and a million. So, as expected, the Euclidean
distance has almost no discriminative power for such sparse data, whereas the
Kolmogorov-Smirnov statistic easily discerns that the data (18) is inconsistent with the model (17).
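
The contrast in this example can be seen from the observed statistics themselves, which have simple closed forms that avoid storing all $2^{32}$ bins. The sketch below is our own back-of-the-envelope calculation rather than part of the paper's computations: the cumulative gap between (18) and (17) peaks at bin n, and the Euclidean distance follows by summing over the n occupied and the remaining empty bins. The values .8687/sqrt(n) and 1/sqrt(n) are the typical sizes under the model suggested by (9) and (16).

```python
from math import sqrt

M = 2**32  # number of bins in the model (17)


def observed_ks(n):
    """Kolmogorov-Smirnov distance between (18) and (17); the cumulative gap peaks at bin n."""
    return 1 - n / M


def observed_euclidean(n):
    """Euclidean distance between (18) and (17): n bins at 1/n, the other M - n bins empty."""
    return sqrt(n * (1 / n - 1 / M) ** 2 + (M - n) / M**2)


for n in (100, 10_000, 1_000_000):
    print(n, observed_ks(n), 0.8687 / sqrt(n), observed_euclidean(n), 1 / sqrt(n))
```

The observed Kolmogorov-Smirnov distance is close to 1, far above its typical size under the model, whereas the observed Euclidean distance is essentially equal to the value it takes for almost every sample from the model.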

Remark 4.1. Like the Euclidean distance, classical goodness-of-fit statistics such as $\chi^2$,
$G^2$ (the log-likelihood-ratio), and the Freeman-Tukey/Hellinger distance are invariant to
the ordering of the bins, and also produce P-values that are equal to 1 to several digits
of precision, at least for n between a hundred and a million. For definitions and further
discussion of the $\chi^2$, $G^2$, and Freeman-Tukey statistics, see Section 2 of Perkins et al. (2011a).



4.2 A test of Poissonity 

A Poisson-distributed random number generator with mean 100 is supposed to produce a
nonnegative integer according to the model

$$p_0^{(j)} = \frac{100^j}{j! \cdot \exp(100)} \tag{19}$$

for j = 0, 1, 2, 3, . . . . We test the (obviously poor) generator which produces the numbers
100, 101, 102, . . . , 109, so that the observed distribution of the numbers is

$$\hat p^{(j)} = \begin{cases} 1/10, & j = 100, 101, 102, \dots, 109 \\ 0, & \text{otherwise} \end{cases} \tag{20}$$

for j = 0, 1, 2, 3, . . . . The P-values, each computed via 4,000,000 simulations, are



• Kolmogorov-Smirnov: .0075

• Euclidean distance: .998

• $\chi^2$: .999

• $G^2$ (the log-likelihood-ratio): .999

• Freeman-Tukey (the Hellinger distance): .998

For definitions and further discussion of the $\chi^2$, $G^2$, and Freeman-Tukey statistics, see
Section 2 of Perkins et al. (2011a). The Kolmogorov-Smirnov statistic is far more powerful for
this example, in which the bins have a natural ordering (in this example the bins are the
nonnegative integers).
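
To see where the difference in power comes from, the observed statistics for this example can be computed directly. The sketch below is our illustration, not the paper's code: it evaluates the Poisson model on a truncated range of bins (the cutoff `J_MAX` is an assumption; the omitted tail mass is negligible) and compares the observed Euclidean distance with its root-mean-square size under the model, as given by (8).

```python
from itertools import accumulate
from math import exp, lgamma, log, sqrt

MEAN, N_DRAWS, J_MAX = 100, 10, 400  # J_MAX truncates the negligible upper tail

# Model probabilities from (19), evaluated in log space to avoid overflow.
model = [exp(j * log(MEAN) - MEAN - lgamma(j + 1)) for j in range(J_MAX + 1)]
# Observed proportions from (20).
observed = [0.1 if 100 <= j <= 109 else 0.0 for j in range(J_MAX + 1)]

diffs = [p_hat - p0 for p_hat, p0 in zip(observed, model)]
ks = max(abs(s) for s in accumulate(diffs))
euclid = sqrt(sum(d * d for d in diffs))
null_rms_euclid = sqrt(sum(p0 * (1 - p0) for p0 in model) / N_DRAWS)

print(f"observed Kolmogorov-Smirnov distance: {ks:.4f}")
print(f"observed Euclidean distance: {euclid:.4f}")
print(f"root-mean-square Euclidean distance under the model: {null_rms_euclid:.4f}")
```

The observed Euclidean distance comes out smaller than its root-mean-square size under the model, which is consistent with the P-value near 1 reported above.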

Figure 1 plots the model probabilities $p_0^{(0)}, p_0^{(1)}, p_0^{(2)}, \dots$ defined in (19) along with the
observed proportions $\hat p^{(0)}, \hat p^{(1)}, \hat p^{(2)}, \dots$ defined in (20). Figure 2 plots the model
probabilities $p_0^{(0)}, p_0^{(1)}, p_0^{(2)}, \dots$ along with analogues of the proportions
$\hat p^{(0)}, \hat p^{(1)}, \hat p^{(2)}, \dots$ for a simulation generating 10 i.i.d. draws according to the model.

Figure 3 plots the cumulative model probabilities
$p_0^{(0)}, p_0^{(0)} + p_0^{(1)}, p_0^{(0)} + p_0^{(1)} + p_0^{(2)}, \dots$
along with the cumulative observed proportions
$\hat p^{(0)}, \hat p^{(0)} + \hat p^{(1)}, \hat p^{(0)} + \hat p^{(1)} + \hat p^{(2)}, \dots$.
Figure 4 plots the cumulative model probabilities
$p_0^{(0)}, p_0^{(0)} + p_0^{(1)}, p_0^{(0)} + p_0^{(1)} + p_0^{(2)}, \dots$
along with analogues of the cumulative proportions
$\hat p^{(0)}, \hat p^{(0)} + \hat p^{(1)}, \hat p^{(0)} + \hat p^{(1)} + \hat p^{(2)}, \dots$
for the simulation generating 10 i.i.d. draws according to the model.



4.3 A test of Hardy- Weinberg equilibrium 



In a population with suitably random mating, the proportions of pairs of Rhesus haplotypes
in members of the population (each member has one pair) can be expected to follow the
Hardy-Weinberg law discussed by Guo and Thompson (1992), namely to arise via random
sampling from the model

$$p_0^{(j,k)}(\theta_1, \theta_2, \dots, \theta_9) = \begin{cases} 2 \cdot \theta_j \cdot \theta_k, & j > k \\ (\theta_k)^2, & j = k \end{cases} \tag{21}$$

for j, k = 1, 2, . . . , 9 with $j \ge k$, under the constraint that

$$\sum_{j=1}^9 \theta_j = 1, \tag{22}$$

where the parameters $\theta_1, \theta_2, \dots, \theta_9$ are the proportions of the nine Rhesus haplotypes in
the population (naturally, their maximum-likelihood estimates are the proportions of the
haplotypes in the given data). For j, k = 1, 2, . . . , 9 with $j \ge k$, therefore, $p_0^{(j,k)}$ is the
expected probability that the pair of haplotypes in the genome of an individual is the pair
j and k, given the parameters $\theta_1, \theta_2, \dots, \theta_9$.
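
The model (21) and the maximum-likelihood estimates described above can be sketched in a few lines of Python. The function names and the tiny two-haplotype data set are hypothetical, chosen only to show the bookkeeping; the data of Guo and Thompson (1992) are not reproduced here.

```python
def hardy_weinberg_model(theta):
    """Return {(j, k): probability} for j >= k under (21), with haplotypes labeled from 1."""
    model = {}
    for j in range(1, len(theta) + 1):
        for k in range(1, j + 1):
            if j == k:
                model[(j, k)] = theta[k - 1] ** 2
            else:
                model[(j, k)] = 2 * theta[j - 1] * theta[k - 1]
    return model


def haplotype_mle(counts, num_haplotypes):
    """Proportions of the haplotypes in genotype counts given as {(j, k): count}."""
    copies = [0] * num_haplotypes
    for (j, k), count in counts.items():
        copies[j - 1] += count  # each individual carries one copy of haplotype j
        copies[k - 1] += count  # and one copy of haplotype k (two copies of j if j == k)
    total = sum(copies)  # twice the number of individuals
    return [c / total for c in copies]


# Hypothetical counts for two haplotypes and four individuals.
counts = {(1, 1): 2, (2, 1): 1, (2, 2): 1}
theta_hat = haplotype_mle(counts, num_haplotypes=2)
print(theta_hat)                        # [0.625, 0.375]
print(hardy_weinberg_model(theta_hat))  # probabilities for (1, 1), (2, 1), (2, 2)
```

With nine haplotypes the model has 45 bins, one for each unordered pair, and the probabilities sum to 1 whenever the constraint (22) holds.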
Figure 1: Proportions associated with the bins for the observations

Figure 2: Proportions associated with the bins for a simulation

Figure 3: Cumulative proportions associated with the bins for the observations

Figure 4: Cumulative proportions associated with the bins for the simulation from Figure 2

In this formulation, the hypothesis of suitably random mating entails that the members
of the sample population are i.i.d. draws from the model specified in (21); if a goodness-of-fit
statistic rejects the model with high confidence, then we can be confident that mating has
not been suitably random.

Table 1 provides data on n = 8297 individuals; we duplicated Figure 3 of Guo and Thompson
(1992) to obtain Table 1. Figure 5 plots the associated P-values, each computed via 90,000
Monte-Carlo simulations. The Kolmogorov-Smirnov statistic depends on the ordering of the
bins; for the first trial t = 1 in Figure 5, the order of the bins is the lexicographical ordering,
namely (1, 1), (2, 1), (2, 2), (3, 1), (3, 2), (3, 3), . . . , (9, 9). The nine trials t = 2, 3, . . . , 10
displayed in Figure 5 use pseudorandom orderings of the bins. Please note that the Euclidean
distance does not depend on the ordering.

Generally, a more powerful statistic produces lower P-values. In Figure 5, the P-values
for the Kolmogorov-Smirnov statistic are sometimes lower, sometimes higher than the
P-values for the Euclidean distance. There is no particularly natural ordering of the bins
for Figure 5; Figure 5 displays 10 different orderings corresponding to 10 different trials.
Figure 5 demonstrates that the Euclidean distance is more reliable than the
Kolmogorov-Smirnov statistic when there is no natural ordering (or partial order) for the bins.

Remark 4.2. The P-values for classical goodness-of-fit statistics are substantially higher;
the classical statistics are less powerful for this example. The P-values, each computed via
4,000,000 Monte-Carlo simulations, are

• Euclidean distance: .039

• $\chi^2$: .693

• $G^2$ (the log-likelihood-ratio): .600

• Freeman-Tukey (the Hellinger distance): .562

For definitions and further discussion of the $\chi^2$, $G^2$, and Freeman-Tukey statistics, see
Section 4.5 of Perkins et al. (2011a). Like the Euclidean distance, the $\chi^2$, $G^2$, and
Freeman-Tukey statistics are all invariant to the ordering of the bins.



4.4 A test of uniformity 

Table 2 duplicates Table 1 of Gilchrist (2010), giving the colors of the n = 62 pieces of candy
in a 2.17 ounce bag. Figure 6 plots the P-values for Table 2 to be consistent up to expected
random fluctuations with Table 3, the model of uniform proportions. We computed each
P-value via 4,000,000 Monte-Carlo simulations. The Kolmogorov-Smirnov statistic depends on
the ordering of the bins; the ten trials t = 1, 2, . . . , 10 displayed in Figure 6 use pseudorandom
orderings of the bins. The Euclidean distance does not depend on the ordering.

Generally, a more powerful statistic produces lower P-values. In Figure 6, the P-values
for the Kolmogorov-Smirnov statistic are sometimes lower, sometimes higher than the
P-values for the Euclidean distance. There is no particularly natural ordering of the bins for
Table 3; Figure 6 displays 10 different pseudorandom orderings corresponding to 10 different
trials. Figure 6 illustrates that the Euclidean distance is more reliable than the
Kolmogorov-Smirnov statistic when there is no natural ordering (or partial order) for the bins.
Table 1: Frequencies of pairs of Rhesus haplotypes

Figure 5: P-values for Table 1 to be consistent with formula (21)



Table 2: Observed frequencies of colors of candies in a 2.17 ounce bag

color     red     orange   yellow   green   violet
number    15      9        14       11      13

Table 3: Expected frequencies of colors of candies in a 2.17 ounce bag

color     red     orange   yellow   green   violet
number    12.4    12.4     12.4     12.4    12.4


Remark 4.3. Table 2 provides a possible means for ordering the bins. However, such an
ordering will depend on the observed data. Using a data-dependent ordering can profoundly
alter the nature of the goodness-of-fit statistic; see Remark 2.2.

Remark 4.4. Like the Euclidean distance, many classical goodness-of-fit statistics are
invariant to the ordering of the bins. The following are P-values, each computed via 4,000,000
Monte-Carlo simulations:

• Euclidean distance: .770

• $\chi^2$: .770

• $G^2$ (the log-likelihood-ratio): .766

• Freeman-Tukey (the Hellinger distance): .755

For definitions and further discussion of the $\chi^2$, $G^2$, and Freeman-Tukey statistics, see
Section 2 of Perkins et al. (2011a). For this example, the Euclidean distance and the $\chi^2$ statistic
produce exactly the same P-values: for the model of homogeneous proportions, displayed in
Table 3, the Euclidean distance is directly proportional to the square root of the $\chi^2$ statistic,
and hence the Euclidean distance is a strictly increasing function of $\chi^2$.
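
The proportionality between the Euclidean distance and the square root of $\chi^2$ for a uniform model can be verified on the data of Table 2. The sketch below assumes the usual Pearson form of $\chi^2$ (the sum over bins of squared differences between observed and expected counts, divided by the expected counts); with m equally likely bins this gives $\chi^2 = n \cdot m \cdot d_2^2$. The variable names are ours.

```python
from math import sqrt

observed_counts = [15, 9, 14, 11, 13]  # Table 2: red, orange, yellow, green, violet
n = sum(observed_counts)               # 62 pieces of candy
m = len(observed_counts)               # 5 colors
expected = n / m                       # 12.4 per color, as in Table 3

chi2 = sum((c - expected) ** 2 / expected for c in observed_counts)
euclid = sqrt(sum((c / n - 1 / m) ** 2 for c in observed_counts))

print(f"chi-square statistic: {chi2:.4f}")
print(f"Euclidean distance: {euclid:.5f}")
print(f"sqrt(chi2 / (n * m)): {sqrt(chi2 / (n * m)):.5f}")
```

Because the map from $\chi^2$ to the Euclidean distance is strictly increasing here, the two statistics rank every simulated data set identically.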
Figure 6: P-values for Table 2 to be consistent with the model displayed in Table 3